AUTOSTANDARDIZED COUNTS BASED ON A
500 BIN WINDOW (SHIFTED BY 200 & 400 UNITS
FOR THE 460 AND 760 CURVES, RESPECTIVELY)

FIG. 8



RESCALE MASS SPECTRUM DATA SO THAT PERIOD
IS 1 amu, WITH (M/Z)|RESCALED COMPUTED FROM (M/Z)|ACTUAL



403



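INTERPOLATE COUNTS TO GENERATE EVENLY SPACED DATA
(e.g. TO ASSURE THERE ARE SUBSTANTIALLY THE SAME NUMBER OF
BINS FOR ALL BLOCKS IN THE MASS SPECTRUM DATA)
FOR EXAMPLE, CALCULATE

$$ \mathrm{COUNTS}|_{\mathrm{INTERPOLATED}} = \frac{\left[(M/Z)|_{\mathrm{INTP}} - (M/Z)|_{\mathrm{LOW}}\right]\left[\mathrm{COUNTS}|_{\mathrm{HIGH}} - \mathrm{COUNTS}|_{\mathrm{LOW}}\right]}{\left[(M/Z)|_{\mathrm{HIGH}} - (M/Z)|_{\mathrm{LOW}}\right]} + \mathrm{COUNTS}|_{\mathrm{LOW}} $$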
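
The interpolation step can be sketched in a few lines of Python. This is a minimal illustration, not the method of the figure itself: the function names `interpolate_counts` and `even_grid` are hypothetical, and the sketch assumes the measured M/Z values are sorted and strictly increasing.

```python
from bisect import bisect_right


def even_grid(start, stop, n_bins):
    """Return n_bins evenly spaced M/Z values from start to stop inclusive."""
    step = (stop - start) / (n_bins - 1)
    return [start + i * step for i in range(n_bins)]


def interpolate_counts(mz, counts, mz_new):
    """Linearly interpolate counts at each M/Z value in mz_new.

    Assumes mz is strictly increasing, has at least two points, and that
    every value in mz_new lies inside [mz[0], mz[-1]].
    """
    out = []
    for x in mz_new:
        if not mz[0] <= x <= mz[-1]:
            raise ValueError("M/Z value outside the measured range")
        # Index of the measured point just above x (HIGH); LOW is the one before.
        hi = min(bisect_right(mz, x), len(mz) - 1)
        lo = hi - 1
        slope = (counts[hi] - counts[lo]) / (mz[hi] - mz[lo])
        out.append((x - mz[lo]) * slope + counts[lo])
    return out
```

Calling `interpolate_counts(mz, counts, even_grid(mz[0], mz[-1], n_bins))` resamples a spectrum onto a uniform grid, so that every 1 amu block can hold the same number of bins.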
NORMALIZE COUNTS (TO SUBTRACT OUT ERROR/
NOISE IN BASELINE)
FOR EXAMPLE, CALCULATE

$$ \mathrm{COUNTS}|_{\mathrm{NORMALIZED}} = \mathrm{COUNTS}|_{\mathrm{INTERPOLATED}} - \mathrm{COUNTS}|_{\mathrm{LOCAL\ MIN.}} $$
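
A minimal sketch of the baseline normalization follows. The figure does not state how the local minimum is located, so the sliding window of `half_window` bins on either side is an assumption made only for illustration, as is the function name `normalize_counts`.

```python
def normalize_counts(counts, half_window):
    """Subtract from each bin the minimum count found near it.

    The neighbourhood is the bin itself plus half_window bins on each side
    (an assumed definition of "local minimum"), clipped at the array ends.
    """
    n = len(counts)
    normalized = []
    for i, c in enumerate(counts):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        normalized.append(c - min(counts[lo:hi]))
    return normalized
```

Because each bin belongs to its own window, the normalized counts are never negative.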
DEVELOP AVERAGE FILTER (DECONVOLUTION) KERNEL,
e.g. COUNTS|KERN FORMED BY SCALING (FACTOR f) THE AVERAGE,
OVER THE N BLOCKS, OF (COUNTS|NORM - COUNTS|min)

DETERMINE KERNEL ERROR
$$ \mathrm{ERROR} = \sum_{i=1}^{N_{\mathrm{BINS}}} \mathrm{COUNTS}|_{\mathrm{KERN}} $$
AND ITERATE ON f TO MINIMIZE ERROR
FILTER OPTIMAL KERNEL FROM COUNTS, e.g.

$$ \mathrm{COUNTS}|_{\mathrm{FILTERED}} = \mathrm{COUNTS}|_{\mathrm{NORM}} - \mathrm{CONSTANT} \times \mathrm{COUNTS}|_{\mathrm{KERN}} $$
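
The filtering step is a bin-by-bin subtraction, sketched below. Two things here are assumptions rather than details from the figure: that the kernel covers one block and is repeated across the spectrum (`tile_kernel`), and the helper names themselves.

```python
def tile_kernel(kernel, n_bins):
    """Repeat a one-block kernel until it covers n_bins bins (assumed layout)."""
    return [kernel[i % len(kernel)] for i in range(n_bins)]


def filter_counts(counts_norm, counts_kern, constant=1.0):
    """Return COUNTS|NORM - CONSTANT x COUNTS|KERN for every bin."""
    if len(counts_norm) != len(counts_kern):
        raise ValueError("kernel and spectrum must have the same number of bins")
    return [c - constant * k for c, k in zip(counts_norm, counts_kern)]
```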



LABEL

MAJOR FRAGMENTS

OBTAIN PROTEIN EXTRACT
(e.g. A CELLULAR/TISSUE EXTRACT
CONTAINING MORE THAN 100 PROTEINS)

LABEL PROTEINS WITH COVALENT
MASS LABEL

SEPARATE LABELLED PROTEINS
(e.g. USE ELECTROPHORESIS — 
ONE OR TWO DIMENSIONAL DEPENDING 
ON NUMBER OF PROTEINS IN SAMPLE; 
CAPILLARY OR GEL ELECTROPHORESIS 
MAY BE USED DEPENDING ON 
MASS LABEL USED) 



207 



DETERMINE COMPLETE OR PARTIAL PROTEIN 
SEQUENCE FOR EACH SEPARATED LABELLED 
PROTEIN BY MASS SPECTROMETRY (e.g. 
ELECTROSPRAY USED IF PROTEINS ARE 
SEPARATED BY CAPILLARY ELECTROPHORESIS; 
MALDI USED IF PROTEINS ARE SEPARATED 
BY GEL ELECTROPHORESIS) 



FIG. 11 

FIG. 12



LABEL PROTEINS/POLYPEPTIDES 
(e.g. LABEL ONE OR MANY EXTRACTED 
PROTEINS WITH A MASS LABEL) AND 
ISOLATE EACH LABELLED PROTEIN/ 
POLYPEPTIDE 



251 

PERFORM COLLISION INDUCED, IN SOURCE,
MASS SPECTROMETRY FOR EACH ISOLATED
PROTEIN/POLYPEPTIDE

TRANSMIT DATA SAMPLE REPRESENTING
MASS SPECTRUM FROM MASS SPECTROMETER
TO PROCESSING SYSTEM

PROCESS FILTERED DATA AUTOMATICALLY
TO OBTAIN PROTEIN SEQUENCE (OR PORTION
OF PROTEIN SEQUENCE SUCH AS A PST -
PROTEIN SEQUENCE TAG)

STORE A PREDETERMINED SET OF MASS/CHARGE
(M/Z) VALUES FOR AMINO ACID SEQUENCES
(e.g. STORE M/Z VALUES FOR ALL POSSIBLE
EXPECTED FRAGMENTS OF LABELLED TERMINAL
PORTIONS OF ALL POSSIBLE PROTEINS)

DETERMINE AN ABUNDANCE VALUE FROM SAID
MASS SPECTRUM DATA FOR EACH M/Z VALUE
IN THE PREDETERMINED SET OF M/Z VALUES


303

CALCULATE A FIRST RANKING (e.g.
PROBABILITY), BASED ON THE ABUNDANCE 
VALUES, FOR EACH SEQUENCE OF A SET OF 
AMINO ACID (AA) SEQUENCES HAVING A 
FIRST NUMBER OF AAs (e.g. 3 AAs) 








CALCULATE A SECOND RANKING (e.g. 
PROBABILITY), BASED ON THE ABUNDANCE 
VALUES, FOR EACH SEQUENCE OF A SET OF 
AA SEQUENCES HAVING A SECOND 
NUMBER OF AAs (e.g. 4 AAs) 


307 








CALCULATE A CUMULATIVE RANKING (e.g. A
CUMULATIVE PROBABILITY), BASED ON THE
FIRST RANKING AND THE SECOND RANKING,
FOR EACH SEQUENCE OF A SET OF AA
SEQUENCES HAVING AT LEAST THE SECOND
NUMBER OF AAs


309 



FIG. 13 

FIG. 14A

FOR EACH POSSIBLE EXPECTED FRAGMENT OF A
TERMINALLY LABELLED SEQUENCE OF AAs, PERFORM
LOOKUP IN MASS SPECTRUM DATA FOR ABUNDANCE
VALUE AT M/Z VALUE OF POSSIBLE EXPECTED FRAGMENT
(e.g. $C_{i,j,k,l} = \mathrm{LOOKUP}[(M/Z)_{i,j,k,l}]$)
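
The lookup operation can be illustrated as a direct index into an evenly binned spectrum. This is a sketch under stated assumptions: the spectrum is stored as a list of counts starting at `mz_start` with a fixed `bin_width`, the nearest bin is used, and an M/Z value outside the measured range yields a count of zero. None of these storage details come from the figure.

```python
def lookup(counts, mz_start, bin_width, mz):
    """Return the abundance in the bin nearest to mz, or 0.0 if out of range."""
    index = round((mz - mz_start) / bin_width)
    if 0 <= index < len(counts):
        return counts[index]
    return 0.0
```

With evenly spaced data, each lookup is a constant-time array access, which matters when every candidate fragment of every candidate sequence is checked.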

DETERMINE MASTER COUNT, FOR EACH PARTICULAR
POSSIBLE SEQUENCE, OVER ALL POSSIBLE ION
TYPES AND CHARGE STATES FOR PARTICULAR SEQUENCE

e.g.

$$ C^{M}_{i,j} = \sum_{l=\mathrm{MIN\ CHARGE\ STATES}}^{\mathrm{MAX\ CHARGE\ STATES}} \left( \sum_{k=1}^{\mathrm{MAX\ ION\ TYPES}} C_{i,j,k,l} \right) $$

WHERE

i=NUMBER OF AMINO ACIDS (AAs) IN SEQUENCE
j=NUMBER OF POSSIBLE SEQUENCES (USUALLY, $19^i$)
k=NUMBER OF ION TYPES FOR EACH RESIDUE
(e.g. ION TYPES a, b, AND POSSIBLY c AT
N TERMINUS AND ION TYPES x, y, AND POSSIBLY
z AT C TERMINUS)
l=NUMBER OF CHARGE STATES FOR EACH ION TYPE



FOR EACH MASTER COUNT ($C^{M}_{i,j}$) FOR A PARTICULAR
POSSIBLE SEQUENCE AT A GIVEN SEQUENCE LENGTH,
COMPARE THE MASTER COUNT TO ALL OTHER MASTER 
COUNTS FOR ALL OTHER POSSIBLE SEQUENCES FOR 
THE GIVEN SEQUENCE LENGTH TO PRODUCE A RANKING 
OF MASTER COUNTS OF ALL POSSIBLE SEQUENCES AT 
THE GIVEN LENGTH OF AAs. FOR EXAMPLE, THE 
COMPARISON MAY BE A RANKING OF PROBABILITIES 
WHERE EACH PROBABILITY IS A PROBABILITY OF A 
PARTICULAR SEQUENCE AT THE GIVEN SEQUENCE 
LENGTH RELATIVE TO ALL OTHER POSSIBLE 
SEQUENCES AT THE SAME LENGTH 
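
A short sketch of the master count and the comparison across candidate sequences follows. The nested-list layout of the individual counts and the function names are assumptions for illustration; the ranking shown is a plain descending sort, whereas the figure also allows a probability-based comparison.

```python
def master_count(counts_by_charge):
    """Sum the looked-up counts over all charge states and ion types.

    counts_by_charge is a list with one entry per charge state; each entry
    is a list with one count per ion type (assumed layout).
    """
    return sum(sum(per_ion_type) for per_ion_type in counts_by_charge)


def rank_sequences(master_counts):
    """Order candidate sequences of one length by master count, highest first.

    master_counts maps a candidate sequence to its master count.
    """
    return sorted(master_counts.items(), key=lambda item: item[1], reverse=True)
```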



TO FIG. 14B

FIG. 14B



FROM FIG. 14A
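
IN ONE EXAMPLE, DETERMINE $P_{i,j}$ (A FIRST
RANKING) FOR EACH PARTICULAR POSSIBLE SEQUENCE
j AT A GIVEN AA LENGTH i:

$P_{i,j}$ = NORMAL DISTRIBUTION
WHERE:

$$ \bar{C}_i = \frac{1}{19^i} \sum_{j=1}^{19^i} C^{M}_{i,j} $$

AND

$$ \sigma_i = \sqrt{\frac{\sum_{j=1}^{19^i} \left( C^{M}_{i,j} - \bar{C}_i \right)^2}{19^i - 1}} $$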
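
One way to turn master counts into a first ranking is sketched below. It assumes, for illustration only, that the ranking of a sequence is the cumulative normal probability of its master count, using the mean and sample standard deviation of all master counts at that length. The figure names a normal distribution but the exact form of its argument is not legible, so treat this choice and the name `first_rankings` as hypothetical.

```python
from statistics import NormalDist, mean, stdev


def first_rankings(master_counts):
    """Map each candidate sequence to a normal-distribution probability.

    master_counts maps a sequence to its master count. At least two
    candidates with differing counts are required, since the sample
    standard deviation must be defined and non-zero.
    """
    values = list(master_counts.values())
    dist = NormalDist(mu=mean(values), sigma=stdev(values))
    return {seq: dist.cdf(count) for seq, count in master_counts.items()}
```

Under this choice, a sequence whose master count sits far above the mean of its competitors gets a ranking close to 1.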



DETERMINE CUMULATIVE RANKING FOR A GIVEN
SEQUENCE OVER DIFFERENT SEQUENCE
LENGTHS, e.g. CALCULATE

$$ R_{j,n} = \prod_{i=1}^{n} P_{i,j} \quad \mathrm{OR} \quad R_{j,n} = \sum_{i=1}^{n} P_{i,j} $$

e.g. $P\left[C^{M}_{2,\,j=\mathrm{A\text{-}A}}\right] + P\left[C^{M}_{3,\,j=\mathrm{A\text{-}A\text{-}x}}\right] = \mathrm{CUMULATIVE\ RANKING}$

SELECT SEQUENCE WITH HIGHEST
CUMULATIVE RANKING
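
The cumulative ranking and the final selection can be sketched as follows. The data layout (a mapping from each full candidate sequence to the list of rankings of its successively longer prefixes) and the function names are assumptions made for the example.

```python
from math import prod


def cumulative_ranking(rankings, use_product=True):
    """Combine per-length rankings by product (default) or by sum."""
    return prod(rankings) if use_product else sum(rankings)


def select_best(candidates, use_product=True):
    """Return the candidate sequence with the highest cumulative ranking.

    candidates maps a sequence to its rankings at lengths 1..n (assumed layout).
    """
    return max(
        candidates,
        key=lambda seq: cumulative_ranking(candidates[seq], use_product),
    )
```

The product form penalizes a sequence heavily for a single poorly supported residue, while the sum form is more forgiving; which is preferable depends on the data.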

FIG. 18A



DETERMINE MOLECULAR WEIGHT (MW) OF A
PARTICULAR RESIDUE/SEQUENCE (e.g. LABEL-Ala,
WHICH IS j=Ala) FROM NUMBER
OF ATOMS (N) OF EACH ELEMENT AND MW
OF ELEMENT, e.g.

$$ MW_{seq} = N_C \times MW_C + N_H \times MW_H + N_N \times MW_N + \ldots $$
(WHERE $N_C$ = NUMBER OF CARBON ATOMS IN RESIDUE,
$MW_C$ = 12 (WEIGHT OF CARBON), etc.)
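
As a worked instance of the weight calculation, the sketch below sums atom counts times element weights. It uses nominal integer weights in the spirit of the figure's "12 for carbon"; the table `ELEMENT_MW` and the function name are illustrative, and the mass label is left out because its composition is not given here.

```python
# Nominal (integer) element weights, matching the figure's MW_C = 12.
ELEMENT_MW = {"C": 12, "H": 1, "N": 14, "O": 16, "S": 32}


def residue_mw(atom_counts):
    """Return the sum over elements of (number of atoms x element weight)."""
    return sum(n * ELEMENT_MW[element] for element, n in atom_counts.items())


# An alanine residue within a peptide chain has the composition C3H5NO.
ALA_RESIDUE = {"C": 3, "H": 5, "N": 1, "O": 1}
```

For the alanine residue this gives 3 x 12 + 5 x 1 + 14 + 16 = 71.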



451 



DETERMINE (OR LOOK UP STORED) WEIGHT
ADJUSTMENTS FOR ION TYPES OF THE
PARTICULAR RESIDUE/SEQUENCE AND
CALCULATE ADJUSTED MWs FOR EACH
DIFFERENT ION TYPE FROM MW AND
WEIGHT ADJUSTMENT
(e.g. $\mathrm{ADJ}_a MW_{seq} = MW_{seq} - a(\mathrm{adj})$;
$\mathrm{ADJ}_b MW_{seq} = MW_{seq} - b(\mathrm{adj})$)



453 



DETERMINE (OR LOOK UP) POSSIBLE CHARGE
STATE ADJUSTMENTS FOR EACH $\mathrm{ADJ}_a MW_{seq}$,
$\mathrm{ADJ}_b MW_{seq}$, etc. AND CALCULATE CORRESPONDING
M/Zs FROM THE $\mathrm{ADJ}_a MW_{seq}$, etc.
DATA AND THE CHARGE STATE ADJUSTMENT
DATA, e.g.
$$ M/Z = \frac{\mathrm{ADJ}_a MW_{seq} + Z \times MW_H}{Z} $$

e.g., THIS PRODUCES A SET OF M/Z VALUES
FOR i=1, j=Ala FOR ALL POSSIBLE k
AND l COMBINATIONS
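
The sketch below generates the set of M/Z values for one sequence over all ion type and charge state combinations. It assumes positive ions charged by protonation, so that M/Z = (adjusted MW + Z x proton mass) / Z. The ion-type adjustment values in the usage note are placeholders chosen for the example, not values from the figures, and `mz_values` is a hypothetical name.

```python
PROTON_MASS = 1.00728  # mass of a proton in Da (charging by protonation is assumed)


def mz_values(mw_seq, ion_adjustments, charge_states):
    """Return {(ion_type, z): m/z} for every ion type and charge state.

    ion_adjustments maps an ion type to the weight subtracted from mw_seq.
    """
    result = {}
    for ion_type, adjustment in ion_adjustments.items():
        adjusted_mw = mw_seq - adjustment
        for z in charge_states:
            result[(ion_type, z)] = (adjusted_mw + z * PROTON_MASS) / z
    return result
```

For instance, `mz_values(100.0, {"a": 28.0, "b": 0.0}, [1, 2])` yields four values, one per (ion type, charge state) pair, which together form the current set to look up in the spectrum.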



SAVE (TEMPORARILY) IN L1 OR L2 CACHE
(OR µP'S SCRATCH PAD MEMORY IF POSSIBLE)
CURRENT SET OF M/Z VALUES

TO FIG. 18B



FROM FIG. 18A

PERFORM LOOKUP IN MASS SPECTRUM DATA
AT EACH M/Z VALUE IN THE SAVED CURRENT 
SET OF M/Z VALUES TO OBTAIN ABUNDANCE 
VALUE AT EACH M/Z 



461 




ERASE CURRENT SET OF M/Z VALUES IN 
CACHE (NOTE: MAY ERASE BY WRITING 
NEW CURRENT SET IN LATER ITERATION) 

REPEAT M/Z CALCULATIONS FOR NEXT POSSIBLE
SEQUENCE (AND REPEAT M/Z CALCULATIONS
FOR ALL OTHER POSSIBLE TERMINAL SEQUENCES
UP TO n AAs IN LENGTH)
(RETURN TO 451)



FIG. 18 B 



ACCUMULATE A COUNT SUM AS EACH COUNT
($C_{i,j,k,l}$) IS DETERMINED FROM LOOKUP OPERATION
(e.g. $SUM_i = \mathrm{PRIOR}\ SUM_i + C_{i,j,k,l}$)
FOR ALL POSSIBLE SEQUENCES OF A NUMBER/
LENGTH OF AAs (e.g. i=4 AAs)



ACCUMULATE A (COUNT)$^2$ SUM AS EACH COUNT
($C_{i,j,k,l}$) IS DETERMINED FROM LOOKUP OPERATION
(e.g. $SUMSQ_i = \mathrm{PRIOR}\ SUMSQ_i + (C_{i,j,k,l})^2$)
FOR ALL POSSIBLE SEQUENCES OF THE NUMBER/
LENGTH OF AAs (i)



AFTER ITERATING THROUGH EACH LOOKUP
OPERATION AT ALL POSSIBLE M/Z VALUES
IN SET OF M/Z VALUES,
CALCULATE MEAN ($\bar{C}_i$) AND STANDARD
DEVIATION ($\sigma_i$) FOR GIVEN NUMBER/LENGTH (i)

e.g.
$$ \bar{C}_i = \frac{SUM_i}{n}, \qquad \sigma_i = \sqrt{\frac{SUMSQ_i - n\,(\bar{C}_i)^2}{n-1}} $$
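
Accumulating a sum and a sum of squares lets the mean and standard deviation be computed in a single pass, without storing every count. The class below is a minimal sketch of that idea; the name `RunningStats` is hypothetical, and the sample (n - 1) form of the standard deviation is assumed.

```python
from math import sqrt


class RunningStats:
    """Accumulate SUM and SUMSQ one count at a time."""

    def __init__(self):
        self.n = 0
        self.total = 0.0
        self.total_sq = 0.0

    def add(self, count):
        """Fold one looked-up count into the running sums."""
        self.n += 1
        self.total += count
        self.total_sq += count * count

    def mean(self):
        return self.total / self.n

    def stdev(self):
        """Sample standard deviation from the running sums (needs n >= 2)."""
        m = self.mean()
        # Clamp at zero to guard against tiny negative rounding errors.
        variance = max(0.0, self.total_sq - self.n * m * m) / (self.n - 1)
        return sqrt(variance)
```

The sum-of-squares shortcut can lose precision when the counts are large and nearly equal, so a production implementation might prefer an incremental update such as Welford's method.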



FOR ALL POSSIBLE AA SEQUENCES OF LENGTH i
PERFORM LOOKUP OPERATIONS FOR ALL
POSSIBLE M/Z VALUES FOR THAT POSSIBLE
AA SEQUENCE AND DETERMINE RANKING FOR
THAT SEQUENCE (e.g. RANKING = $P_{i,j}$, WHERE
$P_{i,j}$ = PROBABILITY DISTRIBUTION FUNCTION)



AND SAVE EACH RANKING (e.g. IN NON-VOLATILE
MEMORY) IN ORDER TO CALCULATE CUMULATIVE
RANKINGS



FIG. 19 




FIG. 20 A 

FIG. 33

5-Br-3-PAA-GLSDGE
(TRUE 6 RESIDUE
SEQUENCE)

5-Br-3-PAA-GLSDW
(COMPETING 5 RESIDUE
FALSE SEQUENCE)

RAW MASS SPECTRUM

N-TERMINAL b-ION POSITIONS

BASELINED MASS SPECTRUM

FIG. 36

TCAGTGCTGCTGCAACATGTTACAGGAAAT

'UNKNOWN' SEQUENCE

M13 PRIMER

AGTCACGACGACGTTGTA
TCAGTGCTGCTGCAACATGTTACAGGAAAT

DNA POLYMERASE
dNTP MIX
ddATP* (SEE FIG. 37A)
ddGTP* (SEE FIG. 37B)

RNAse,
DENATURATION

PRIMER
REMOVAL

PRIMER
REMOVAL

DNA POLYMERASE
dNTP MIX
ddTTP* (SEE FIG. 37C)
ddCTP* (SEE FIG. 37D)

RNAse,
DENATURATION